This document describes wave-based non-line-of-sight (NLOS) imaging using fast frequency-wavenumber (f-k) migration. It presents a new hardware prototype for room-sized NLOS imaging and interactive scanning. It also introduces a fast wave-based image formation model that enables reconstruction in under a second, compared to over an hour for previous methods such as the light-cone transform. The document outlines the f-k migration method and compares it to other NLOS imaging approaches.
9. contributions
• fast wave-based image formation model for NLOS imaging
• complex surface reflectances, more robust to noise
• new hardware prototype: room-sized scenes, interactive scanning
16. [Slide diagram: a laser and detector focus on a point on the wall to image a hidden object; confocal sampling illuminates and images the same points; detected photons are timestamped (nanoseconds) and binned into a histogram. Wall photo: APJarvis [CC BY-SA 4.0]]
36. f-k Migration
• Express wavefield as a function of the measurement spectrum (plane wave decomposition)
• [Equation: wavefield written in terms of the Fourier transform of the measurements]
• Set t = 0 to get the migrated solution
• Almost an inverse Fourier transform!
37. f-k Migration
• Set t = 0 to get the migrated solution
• Almost an inverse Fourier transform!
• Use dispersion relation¹ to perform a substitution of variables
¹Georgi, Howard. The Physics of Waves. Englewood Cliffs, NJ: Prentice Hall, 1993.
52. conclusion
• confocal sampling + wave-based models enable fast, robust NLOS imaging
• fast scanning and outdoor reconstruction are possible with an optimized hardware setup
• moving towards practical NLOS imaging “in the wild”
54. Fermat Paths [Xin et al. 2019]
Reconstructs surface geometry from discontinuities in histogram measurements
+ surface reconstruction
+ works with specular objects
- single object per scene
- noise sensitivity (requires identifying discontinuities)
Editor's Notes
I’m David Lindell, and I’ll be talking about a wave-based model for non-line-of-sight imaging.
In non-line-of-sight imaging, we image objects that are hidden from direct line of sight. To do this we measure the time it takes for short pulses of light to bounce off a wall and scatter back from a hidden object.
These time delays give us the distance from the wall to the hidden object. But we can also use these types of measurements to recover the full 3D geometry.
Here’s an example scene that we captured outdoors using the side of a building. This is a room-sized scene that we’re capturing under indirect sunlight.
From the perspective of the imaging system, we can only see the wall that we’re illuminating and imaging.
From behind the scene, though, we can observe that the light scatters off of the wall and reflects back from the objects in the scene. This light eventually scatters back to our detector, where we capture the timing information.
The measurements capture an ultra high-speed view of light returning from the hidden objects and scattering against the wall.
From these measurements, we reconstruct the 3D geometry. The reconstruction we get is volumetric, with the intensity of each voxel corresponding to the brightness or albedo of the object.
The key contributions of our work are to introduce a fast wave-based image formation model for NLOS imaging which overcomes some of the limitations of previous approaches based on geometric optics.
In particular, our approach can handle complex surface reflectances and is more robust to noise than previous approaches. We also built a new hardware prototype and demonstrate imaging room-sized scenes and NLOS imaging at interactive rates for the first time.
With these new capabilities we move towards enabling NLOS imaging in more practical applications like autonomous vehicle navigation.
The current LIDAR systems for these vehicles use basically the same detectors and lasers that we use in our imaging system. However, they are limited to imaging line-of-sight points. Adding 3D information about objects around the corner, outside the direct line of sight, could help cars navigate more safely.
For the remainder of the talk, I’ll go into more detail about NLOS image formation and related work, explain our wave-based model, and then describe our hardware prototype and some additional results.
To measure the travel time of photons, we use a detector called a single-photon avalanche diode, or SPAD.
We pair the SPAD with a picosecond laser that sends millions of short pulses of light into the scene every second. Each pulse interacts with the scene and sends some scattered light back to our detector. For each pulse emitted into the scene, the SPAD has a chance at detecting the arrival of a single photon.
When a detection occurs, we get a timestamp of when the photon arrived with respect to when the laser pulse was emitted.
These timestamps are accumulated and binned into a histogram showing how much light is arriving over picosecond time intervals.
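The timestamp-to-histogram step can be sketched with NumPy; the timestamps and the 16 ps bin width below are made-up illustrative values, not numbers from the actual system.

```python
import numpy as np

# Hypothetical photon arrival timestamps (picoseconds), measured
# relative to when the laser pulse was emitted.
timestamps_ps = np.array([120.0, 485.3, 122.1, 481.9, 119.5, 483.7, 950.2])

# Bin the timestamps into a histogram with (assumed) 16 ps bins,
# mimicking the time-resolved measurement the SPAD accumulates.
bin_width_ps = 16.0
edges = np.arange(0.0, 1024.0 + bin_width_ps, bin_width_ps)
counts, _ = np.histogram(timestamps_ps, bins=edges)

# Peaks in `counts` correspond to light paths of a particular length.
peak_bin = np.argmax(counts)
```

With enough pulses, the histogram approximates the scene's temporal response, and the secondary peaks encode the geometry of the hidden object.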
In NLOS imaging, we align the detector and the laser so that we illuminate and image the same spot on the wall. The measurements capture the number of photon counts over time. We observe a large peak from the direct path to the wall (which I’ve positioned at time zero), and then some nanoseconds later a secondary peak from the hidden object.
We can also align the laser and detector to illuminate and image different points, which we call non-confocal sampling. In this case, the measured time delay gives us the round-trip distance from the laser point to the hidden object to the detector point.
Alternatively, confocal sampling is illuminating and imaging the same point. With this sampling geometry, the time delay in the measurements directly captures the distance between the wall and the hidden object. Confocal sampling is also nice because LOS LIDARs are designed to illuminate and image the same point.
Confocal sampling is equivalent to having each point on the hidden object emit light, and the resulting wavefront propagates at half the speed of light to the wall. This equivalent, but simplified model enables efficient reconstructions and existing LIDAR systems are designed for confocal sampling.
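As a sanity check on the confocal geometry: a measured round-trip delay t converts to wall-to-object distance as d = c * t / 2, which is the same as a wavefront traveling at half the speed of light for time t. A minimal sketch with a hypothetical 10 ns delay:

```python
# Speed of light in meters per nanosecond.
C = 0.299792458

def confocal_distance_m(delay_ns: float) -> float:
    """Wall-to-object distance from a confocal round-trip delay.

    The photon travels wall -> object -> wall, so the one-way
    distance is half the path length; equivalently, in the
    half-speed model the wavefront covers d = (c / 2) * t.
    """
    return 0.5 * C * delay_ns

# A (hypothetical) secondary peak 10 ns after the direct reflection
# corresponds to an object roughly 1.5 m from the wall.
d = confocal_distance_m(10.0)
```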
Here’s an example measurement that we captured where we illuminate and image a spot on the wall. As I move back and forth with this exit sign, we can observe the time difference to the secondary peak.
While illuminating and imaging one point gives the distance to the hidden object,
we can also scan a patch on the wall to capture a 3D measurement volume.
We use these measurements to reconstruct the 3D geometry.
Using geometric optics, the measurements and hidden volume can be related through a transport matrix, A.
Backprojection and direct inversion techniques have been proposed for reconstruction, but their computational requirements scale as O(N^5). For example, for a low-resolution volume with N = 100, A has a trillion elements; for N = 1000 we would need petabytes of memory to store the matrix. Even matrix-free implementations have intractable computational requirements.
Thus, these techniques are largely intractable for high-resolution reconstructions.
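To make the scaling concrete, here is quick arithmetic under the assumption (as in the confocal setup) that an n^3 measurement volume maps to an n^3 hidden volume, so the transport matrix A has n^3 x n^3 entries:

```python
def transport_matrix_entries(n: int) -> int:
    """Number of entries in a dense transport matrix A relating an
    n^3 measurement volume to an n^3 hidden volume: n^3 x n^3 = n^6."""
    return (n ** 3) ** 2

small = transport_matrix_entries(100)   # 10^12 entries: ~1 trillion
large = transport_matrix_entries(1000)  # 10^18 entries

# Even at a single byte per entry, the larger case is far beyond
# any workstation's memory, which is why transform-based methods
# that never form A explicitly are so attractive.
```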
To address this, another approach uses confocal scanning and the light-cone transform or LCT. This technique transforms the measurements to a domain where the reconstruction can be performed with deconvolution.
While this method is fast, it is limited to diffuse and retroreflective hidden objects.
Instead of using a geometric image formation model, we rely on a wave model based on the time-dependent wave equation. This model extends to varied surface reflectances and we can solve the reconstruction without sacrificing any computational speed.
Wave-based models have been used in the computer graphics community for simulation, rendering, and sound, but they have also been used for radar, sonar, and seismic imaging. This is incredibly useful because, by recognizing the connections to these domains, we can leverage decades of research for this new application of NLOS imaging.
One example of where this wave-based model is used…
is in seismic imaging, which turns out to have many similarities to NLOS imaging. Here, shockwaves caused by explosions on the surface cause waves to propagate downward, reflect off of hidden underground surfaces, and are picked up by receivers at the earth’s surface. The measurements are used to reconstruct the underground structures.
In optical non line of sight, the laser illumination is like the shockwave which is reflected back by the hidden object.
We can model the confocal measurements as the value of a wavefield over time that starts from the surface of the hidden object at time zero and propagates to the observed wall. I’ll show the whole wavefield over x,z, and time on the left, and then the captured measurements evaluated at the wall on the right.
Our goal is then to migrate the wavefield from z=0 back to t=0 to recover the surface of the hidden object. One way to do this is to time-reverse the measurements and solve the differential equation using finite difference techniques, but this is extremely expensive to compute. For example, this simple 2d simulation took 20 minutes to compute.
The finite-difference technique consists of discretizing the measurement domain and then approximating the differential wave equation with finite differences. We can then solve for the next time-reversed step in terms of the current and previous measurement steps. This corresponds to an update step that we repeatedly compute and apply until we reach t = 0.
In general, though, this iterative method is slower than necessary if we only need to recover the value at time zero.
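The time-reversed finite-difference stepping described above can be illustrated in 1D; the grid size, step count, and CFL factor here are arbitrary choices for the sketch, not the settings of the 2D simulation mentioned in the talk. The key point is that the explicit update is time-symmetric, so running it backwards from the last two recorded states recovers the initial field.

```python
import numpy as np

# 1D scalar wave equation u_tt = c^2 u_xx on a periodic grid,
# stepped with an explicit finite-difference scheme.
nx, c, dx = 200, 1.0, 1.0
dt = 0.5 * dx / c  # satisfies the CFL stability condition (c*dt/dx <= 1)
r2 = (c * dt / dx) ** 2

def step(u_curr: np.ndarray, u_prev: np.ndarray) -> np.ndarray:
    """One explicit time step of the discretized wave equation."""
    lap = np.roll(u_curr, 1) - 2.0 * u_curr + np.roll(u_curr, -1)
    return 2.0 * u_curr - u_prev + r2 * lap

# Initial condition: a Gaussian bump at rest.
x = np.arange(nx)
u0 = np.exp(-0.5 * ((x - nx / 2) / 4.0) ** 2)
u_prev, u_curr = u0.copy(), u0.copy()

# March forward 100 steps...
for _ in range(100):
    u_prev, u_curr = u_curr, step(u_curr, u_prev)

# ...then reverse: the update is symmetric in time, so swapping the
# last two states and stepping again walks back toward t = 0.
u_prev, u_curr = u_curr, u_prev
for _ in range(100):
    u_prev, u_curr = u_curr, step(u_curr, u_prev)

err = np.max(np.abs(u_curr - u0))
```

Each migrated time step costs a full grid update, which is why stepping all the way back to t = 0 is so much slower than a transform-domain method that jumps there directly.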
Another recent approach to solving this migration problem is called phasor fields. In this approach, the measurements are convolved with a virtual wave function (a wavelet, for example) to simulate the scene's response to a known wave. Then the resulting wavefield is propagated backwards using Rayleigh-Sommerfeld diffraction or Fresnel propagation. The benefit here is that you can also propagate to arbitrary time values; the computational requirements depend on the propagation method and can be as fast as O(N^3 log N).
Our approach adapts f-k migration which is used in radar and seismology to efficiently go from one boundary condition to another. In this case, we can directly migrate the measurements to the hidden surface. This approach also has the benefit of being fast and uncomplicated.
Now I’m going to show this for captured measurements in 3D. On the left is an image of the room-sized hidden scene with different surface reflectances. On the right are the captured measurements. The concentrated streaks at the beginning of the volume come from the glossy dragon, the bright spots are specular reflections from the discoball, and the broad elongated streaks at the most distant part correspond to diffuse reflections from the statue.
I’ll play back the measurements as an ultrafast video, and we can take a minute to watch the ripples of light that we observe at the wall from the hidden scene.
(fade in music)
Here’s how the reconstruction works. We take the Fourier transform of our measurements and resample it. Then an inverse Fourier transform reconstructs the hidden volume. Let me focus on why this resampling step works.
We express the wavefield as a function of the measurement spectrum, that is, as an integral over plane waves at different frequencies. To get the expression for the migrated wavefield, we set t = 0.
The resulting integral expression gives us the surface of the hidden volume but is expensive to compute numerically. However, we see that it is almost an inverse Fourier transform, except that the spatial and temporal frequency variables do not match.
We can use the dispersion relation to massage this equation into the right form. This relation can be derived from the wave equation or found in physics textbooks; it relates the spatial and temporal frequencies of electromagnetic waves.
The temporal frequency of the wave is given by one over the period,
its spatial wavenumber is given by 2π over the wavelength,
and the dispersion relation describes how fast waves at one temporal frequency propagate in all spatial directions.
This equation describes exactly how to resample the hidden volume:
it allows us to convert from temporal frequency f to spatial frequency k_z by rearranging the Fourier coefficients.
After the substitution of variables, we find that the migrated wavefield is given by an inverse Fourier transform, and this gives us our reconstructed volume.
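The transform-resample-inverse-transform pipeline can be sketched in 2D (one wall axis plus time) with NumPy. This is a simplified illustration of Stolt-style resampling, not the paper's implementation: the amplitude (Jacobian) scaling is omitted, the depth grid is tied to the time grid via dz = c * dt, and for confocal NLOS one would substitute the half speed of light for c.

```python
import numpy as np

def fk_migrate_2d(meas: np.ndarray, dx: float, dt: float, c: float) -> np.ndarray:
    """Sketch of Stolt-style f-k migration in 2D.

    meas[it, ix]: wavefield recorded at the wall over time t and position x.
    Returns an image over depth z (with dz = c * dt) and position x.
    """
    nt, nx = meas.shape
    # 1) Fourier transform of the measurements.
    spec = np.fft.fftshift(np.fft.fft2(meas))
    f = np.fft.fftshift(np.fft.fftfreq(nt, d=dt))   # temporal frequencies
    kx = np.fft.fftshift(np.fft.fftfreq(nx, d=dx))  # spatial frequencies
    # 2) Resample along the temporal-frequency axis using the dispersion
    #    relation f = c * sign(k_z) * sqrt(k_x^2 + k_z^2), converting
    #    temporal frequency f to depth frequency k_z. With dz = c * dt,
    #    the target k_z grid aligns with f / c.
    kz = f / c
    out = np.zeros_like(spec)
    for j in range(nx):
        f_src = c * np.sign(kz) * np.sqrt(kx[j] ** 2 + kz ** 2)
        re = np.interp(f_src, f, spec[:, j].real, left=0.0, right=0.0)
        im = np.interp(f_src, f, spec[:, j].imag, left=0.0, right=0.0)
        out[:, j] = re + 1j * im
    # 3) Inverse Fourier transform yields the migrated image.
    return np.real(np.fft.ifft2(np.fft.ifftshift(out)))

# Tiny demo on a hypothetical impulse "measurement".
m = np.zeros((64, 32))
m[10, 16] = 1.0
img = fk_migrate_2d(m, dx=1.0, dt=1.0, c=1.0)
```

The inner loop interpolates each spatial-frequency column of the measurement spectrum at the temporal frequencies dictated by the dispersion relation; in practice this resampling is vectorized and combined with the frequency-domain scaling terms.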
Here’s the reconstruction using f-k migration; we can even reconstruct the position of the disco ball, which sits on the edge and sticks out from the shelf.
These other methods are really designed for diffuse reflectances, so they produce artifacts in the reconstruction; f-k migration produces a cleaner result.
We captured additional results with a new hardware prototype that is shown here.
The main part of the prototype consists of a high-power laser and spad which share an optical axis. Two scanning mirrors allow us to scan a wall or other scan surface to probe the hidden scene.
Here’s a video of the hardware prototype in action; the laser sends out 35 ps pulses 10 million times every second.
We also wanted to see how fast we could push the scanning of this system, and so I dressed up in a retroreflective suit to increase the amount of available signal. By scanning at 4 Hz, we can capture human pose and location at low resolution from around the corner.
Our empirical results show that f-k migration is robust at low exposure settings. The LCT result is noisier, partly because it does not model the glossy surface. Also, f-k migration requires no parameter tuning.
The improved quality of f-k migration is also apparent here, especially compared to filtered backprojection for a 10 min. exposure time.
Since a confocal scanning arrangement can be somewhat restrictive and because not all scan surfaces are planar walls, we also derive in the paper computational corrections to apply f-k migration in these scenarios. Additionally, we show how phase retrieval techniques can be used to improve reconstruction results.
In terms of cost, our hardware prototype is expensive, but about the same as any other SPAD-based setup. However, these costs will probably come down as more commercial lidar systems emerge.
We reconstruct hidden volumes with a size of 512^3 voxels, and we also have data at 1024^3 resolution. However, at these resolutions, for a 2x2x2 m volume, we are sampling at or above the fundamental resolution limits for diffuse objects that we derived in our 2018 paper published in Nature. Higher resolutions can only be achieved with specular or retroreflective objects, or with priors that hallucinate details.
While handling occlusions could improve reconstruction quality, it comes at a cost in runtime and memory. Our method could also be implemented on a GPU for real-time reconstruction.
In conclusion, we’ve demonstrated NLOS imaging based on confocal scanning and wave-based models that is fast and more robust than other methods.
We’ve also demonstrated outdoor reconstruction and interactive scanning for the first time,
and we take steps towards making NLOS imaging practical in everyday scenarios for applications like autonomous vehicle navigation.
We’ve also made our code and data available on our project webpage.
At CVPR, the best paper award was given to a paper that related the principle of Fermat paths to NLOS imaging. This works by identifying discontinuities in the histograms of photon counts and using these to recover the hidden object's surface.
This technique is really unique in being able to work on surfaces and surface normals rather than a volumetric representation. It also works with specular objects.
However, in practice this technique has only been shown with a single object per scene, and it may be more difficult to make it work with many separate objects.
Since the technique requires detecting discontinuities in the measurements, it may also be more sensitive to noise than our approach.
It may also be possible to use both methods to further improve reconstruction quality.